Sains Malaysiana 52(7)(2023): 1901-1914

http://doi.org/10.17576/jsm-2023-5207-01

 

RFE-Based Feature Selection to Improve Classification Accuracy for Morphometric Analysis of Craniodental Characters of House Rats

(Pemilihan Ciri Berasaskan RFE untuk Meningkatkan Ketepatan Pengelasan dalam Analisis Morfometri Sifat Kraniodental Tikus Rumah)

 

ANEESHA BALACHANDRAN PILLAY1, DHARINI PATHMANATHAN1*, ARPAH ABU2 & HASMAHZAITI OMAR2

 

1Institute of Mathematical Sciences, Faculty of Science, Universiti Malaya, 50603 Kuala Lumpur, Malaysia

2Institute of Biological Sciences, Faculty of Science, Universiti Malaya, 50603 Kuala Lumpur, Malaysia

 

Received: 24 October 2022/Accepted: 26 June 2023

 

Abstract

In conventional morphometrics, researchers often collect and analyze large numbers of morphometric features to study shape variation among biological organisms. Feature selection is a fundamental machine learning tool for removing irrelevant and redundant features. Recursive feature elimination (RFE) is a popular feature selection technique that reduces data dimensionality by selecting a subset of attributes based on a ranking of predictor importance. In this study, we performed RFE on craniodental measurements of Rattus rattus to select the best feature subset for both males and females. We also compared three machine learning algorithms (Naïve Bayes, Random Forest, and Artificial Neural Network), using all features and the RFE-selected features, to classify the R. rattus samples by age group. The Artificial Neural Network provided the best accuracy among the three classification models.

 

Keywords: ANN; machine learning; naïve Bayes; recursive feature elimination; traditional morphometrics
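
As a rough illustration of the RFE idea summarized in the abstract (fit a model, rank the predictors by importance, drop the weakest, and repeat), the following minimal Python sketch may help. It is not the authors' implementation: it assumes a plain logistic regression whose absolute standardized weights serve as the importance ranking, a two-class problem rather than the study's age groups, and purely synthetic data. The feature names (skull_length, skull_width, noise_a, noise_b) and all values are hypothetical.

```python
import math
import random


def standardize(X):
    """Scale each column to zero mean and unit variance so weights are comparable."""
    cols = list(zip(*X))
    means = [sum(c) / len(c) for c in cols]
    stds = [math.sqrt(sum((v - m) ** 2 for v in c) / len(c))
            for c, m in zip(cols, means)]
    return [[(v - m) / s for v, m, s in zip(row, means, stds)] for row in X]


def fit_logistic(X, y, epochs=300, lr=0.1):
    """Fit binary logistic regression by batch gradient descent; return the weights."""
    n, p = len(X), len(X[0])
    w, b = [0.0] * p, 0.0
    for _ in range(epochs):
        grad_w, grad_b = [0.0] * p, 0.0
        for row, label in zip(X, y):
            z = b + sum(wj * xj for wj, xj in zip(w, row))
            z = max(-30.0, min(30.0, z))  # clamp to keep exp() well-behaved
            err = 1.0 / (1.0 + math.exp(-z)) - label
            grad_b += err
            for j, xj in enumerate(row):
                grad_w[j] += err * xj
        w = [wj - lr * g / n for wj, g in zip(w, grad_w)]
        b -= lr * grad_b / n
    return w


def rfe(X, y, names, n_keep):
    """Refit on the remaining features and drop the one with the smallest |weight|."""
    active = list(range(len(names)))  # column indices still in the model
    eliminated = []
    while len(active) > n_keep:
        subset = [[row[j] for j in active] for row in X]
        w = fit_logistic(subset, y)
        worst = min(range(len(active)), key=lambda k: abs(w[k]))
        eliminated.append(names[active.pop(worst)])
    return [names[j] for j in active], eliminated


# Synthetic data (hypothetical): two informative measurements, two pure-noise ones.
rng = random.Random(0)
names = ["skull_length", "skull_width", "noise_a", "noise_b"]
X, y = [], []
for i in range(200):
    label = i % 2
    X.append([rng.gauss(3.0 * label, 1.0), rng.gauss(3.0 * label, 1.0),
              rng.gauss(0.0, 1.0), rng.gauss(0.0, 1.0)])
    y.append(label)

selected, eliminated = rfe(standardize(X), y, names, n_keep=2)
print("selected:", selected)
print("eliminated (first dropped first):", eliminated)
```

The model is refit after every elimination, which is what distinguishes RFE from a one-shot ranking: the importance of the remaining features is re-estimated once a weaker feature is gone. In practice the number of features to keep would usually be chosen by cross-validated accuracy rather than fixed in advance as it is here.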

 

Abstrak

Dalam morfometri konvensional, para penyelidik sering mengumpul dan menganalisis data dengan menggunakan bilangan ciri yang besar untuk mengkaji variasi bentuk antara organisma biologi. Pemilihan ciri memainkan peranan penting dalam algoritma pembelajaran mesin untuk mengeluarkan ciri-ciri yang tidak relevan dan berlebihan. Penghapusan ciri rekursif (RFE) merupakan kaedah pemilihan ciri terkenal yang boleh mengurangkan dimensi data dan juga boleh membantu memilih subset sifat berdasarkan kedudukan kepentingan peramal. Dalam kajian ini, kita menjalankan RFE pada ukuran kraniodental linear bagi data Rattus rattus untuk memilih subset ciri terbaik bagi kedua-dua tikus jantan dan betina. Kita telah menjalankan kajian perbandingan berdasarkan tiga algoritma pembelajaran mesin seperti Bayes Naif, Hutan Rawak dan Rangkaian Neural Tiruan menggunakan semua ciri dan ciri terpilih secara RFE untuk mengelaskan sampel R. rattus berdasarkan kumpulan umur. Setelah memantau hasil nilai ketepatan yang diperoleh bagi ketiga-tiga model tersebut, Rangkaian Neural Tiruan telah terbukti memberi ketepatan yang terbaik antara ketiga-tiga model ini.

 

Kata kunci: ANN; Bayes naif; morfometri tradisi; pembelajaran mesin; penghapusan ciri rekursif

 

 

 


*Corresponding author; email: dharini@um.edu.my